ABSTRACT


Finished products and manufacturing plants are elements of the production system in a supply chain, which
typically contains several manufacturing plants. These plants produce work in process and finished products and
hold them in warehouses, so they need to plan and control production and inventories. Isolated planning and
control by different manufacturers increases their inventories, so planning and control must be carried out in an
integrated manner. This paper presents an iterative approach for solving the optimal control problem with bounded
control variables. The iterative method is built on a projection function that approximates the control law. From
the approximate control law, approximations of the state and co-state variables are obtained through the
Hamiltonian of the optimal control problem. A simple example is given to compare the results with a previously
published paper. A case study on production planning in a three-stock reverse logistics system with deteriorating
items is also solved to show the method's performance.
1. Introduction


The study of the linear quadratic optimal control problem (OCP) with linear systems has a 
history of over fifty years. Many attempts have been made to obtain a satisfactory solution 
based on different approaches. The application of Pontryagin’s maximum principle to OCP, as 
outlined by Naidu (2003) and Pontryagin et al. (1962), results in a system of coupled two-point 
boundary-value (TPBV) problems. Within the Dynamic Programming approach, the sufficient 
conditions for an optimal controller and the functional with prescribed derivative proposed in 
Kharatishvili (1961) lead to a set of partial differential equations called the Riccati Equation for 
the systems. Neural networks are another approach favoured by researchers
(Pooya et al., 2021; Effati et al., 2021). In these methods, the OCP is transformed into a system of
equations, which is then solved with a known neural network such as the perceptron.

In optimal control problems, the control is sometimes restricted to lie between a lower and an
upper bound; such a problem is called a bounded optimal control problem. Bang-bang optimal
control problems are a special case in which the optimal control switches from one extreme to
the other (i.e., it is never strictly between the bounds). Bounded optimal control problems have
many applications, such as modelling infectious diseases (Sweilam and Al-Mekhlafi, 2021; Liu et
al., 2022; Ojo et al., 2022; Kovacevic et al., 2022; Sweilam et al., 2020), tank reactor systems
(Göllmann et al., 2009), production planning systems (Hedjar et al., 2015; Pooya and
Pakdaman, 2017; 2018; 2019), etc. One major hurdle for bounded optimal control problems is
that their solution approach differs from the methods used when the control is unrestricted.

Motivated by the above discussion, we present a novel method to solve delay and
bounded optimal control problems. We apply the projection function to tackle the
challenge of bounded control variables, and we test the method on a case study to show its
performance. The case study concerns production planning in a three-stock reverse
logistics system with deteriorating items. The aims of the paper can be summarized as
follows:

1. Use the projection method to solve the OCP. 

2. Solve a production planning problem modelled by an OCP. 

The paper is organized as follows. The next section is dedicated to the problem formulation and
optimality conditions for the OCP. The iterative method is proposed in Section 3. The simulation
results, including the case study, are presented in Section 4, and the paper is concluded in Section 5.
2. Problem formulation and optimality conditions


In this section, the problem formulation and the optimality conditions of the problem are stated.
Consider the OCP in the following form:


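$$
\begin{aligned}
\min\ & J = \frac{1}{2} x^T(t_f) S x(t_f) + \frac{1}{2} \int_0^{t_f} \left( x^T(t) Q x(t) + u^T(t) R u(t) \right) dt \\
\text{s.t.}\ & \dot{x}(t) = A x(t) + B u(t), \\
& x(0) = x_0, \\
& u(t) \in K, \quad t \in [0, t_f],
\end{aligned}
\tag{1}
$$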
where $x(t)$ and $u(t)$ are the piecewise continuous state and control vectors, respectively.
Also, $A$ and $B$ are two matrices of appropriate dimensions and $x_0$ is the initial state. Moreover,
$K \subset \mathbb{R}^m$ is a closed set. The initial condition $x(0) = x_0$ is given. The terminal time $t_f$ is
specified, and the final state $x(t_f)$ is not specified. Furthermore, $Q, S \in \mathbb{R}^{n \times n}$ are positive
semi-definite and $R \in \mathbb{R}^{m \times m}$ is positive definite.
Now, we state the optimality conditions of problem (1). Consider the following
Hamiltonian for (1):

$$
H(x(t), \lambda(t), u(t), t) = \frac{1}{2} x^T(t) Q x(t) + \frac{1}{2} u^T(t) R u(t) + \lambda^T(t) \left[ A x(t) + B u(t) \right], \tag{2}
$$


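where $\lambda(t)$ is the co-state variable. Based on equation (2), the optimality conditions can be stated
as follows:

$$
\dot{x}(t) = \frac{\partial H}{\partial \lambda} = A x(t) + B u(t), \tag{3}
$$
$$
\dot{\lambda}(t) = -\frac{\partial H}{\partial x} = -Q x(t) - A^T \lambda(t), \tag{4}
$$
$$
u(t) = \arg\min_{u \in K} H(x(t), \lambda(t), u(t), t), \quad 0 \le t \le t_f, \tag{5}
$$
$$
\lambda(t_f) = S x(t_f), \quad x(0) = x_0. \tag{6}
$$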
Equations (3)-(6) form a TPBV problem. The initial value of $x(t)$ is $x(0) = x_0$
and the terminal value of $\lambda(t)$ is $\lambda(t_f) = S x(t_f)$.
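As a worked step, the partial derivatives of the Hamiltonian (2) can be written out explicitly, assuming $Q$ and $R$ are symmetric. The unconstrained case $K = \mathbb{R}^m$ is included only as a familiar point of comparison; it is not the setting studied in this paper.

```latex
\[
\frac{\partial H}{\partial u} = R\,u(t) + B^{T}\lambda(t), \qquad
\frac{\partial H}{\partial x} = Q\,x(t) + A^{T}\lambda(t), \qquad
\frac{\partial H}{\partial \lambda} = A\,x(t) + B\,u(t).
\]
% Comparison case without bounds (K = R^m): setting dH/du = 0 gives
\[
u(t) = -R^{-1}B^{T}\lambda(t).
\]
```

With a bounded set $K$, the stationarity condition is replaced by the minimization over $K$ in (5), so the closed-form law above generally does not apply on arcs where a bound is active.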


3. Projection method for solving OCP 
Here, the projection method for solving the OCP is studied.
Consider the optimality conditions of OCP (1) stated in equations (3)-(6). In these conditions,
equation (5) is replaced by equation (7):
$$
u(t) - P_K\left[ u(t) - Z(u(t)) \right] = 0, \quad 0 \le t \le t_f, \tag{7}
$$


where $P_K(\cdot)$ is a projection map defined as (Eshaghnezhad et al., 2022; Mansoori and
Effati, 2019):

$$
P_K(u) = \arg\min_{v \in K} \| u - v \|.
$$

Also, $Z(u(t)) = \partial H / \partial u(t)$. Note that $P_K(\cdot)$ is a piecewise function. Here, some results about
$Z(u(t))$ are investigated.
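When $K$ is a box, i.e. componentwise lower and upper bounds such as the control bounds used in Section 4, the minimization that defines the projection has a closed form: each component is clipped to its interval. A minimal sketch under that assumption (the helper name `project_box` is hypothetical and not from the paper):

```python
import numpy as np

def project_box(u, lower, upper):
    """Euclidean projection of u onto K = {v : lower <= v <= upper}, componentwise."""
    return np.clip(u, lower, upper)

# A point outside the box is moved to the nearest point of the box;
# components already inside the bounds are left unchanged.
u = np.array([-2.0, 0.3, 1.7])
print(project_box(u, -1.0, 1.0))
```

For a general closed convex set $K$ the projection has no such one-line form and would have to be computed by solving the minimization problem itself.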


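Lemma 1. $Z(u(\cdot))$ satisfies the Lipschitz condition.

Proof. Since $Z(u(t)) = \partial H / \partial u(t) = R u(t) + B^T \lambda(t)$ is affine in $u(t)$, and $\lambda(t)$
depends on $u$ only through the linear equations (3) and (4), the proof is obvious.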


Remark 2. According to equations (3)-(6), to obtain the solution to the problem we first
find $u(t)$ from equation (5) and then substitute it into equations (4) and (3), from which
the co-state vector $\lambda(t)$ and the state vector $x(t)$ are obtained.


Building on the previous discussion, we now set up an iterative scheme to find
the solution to the problem.

The projection method gives an iteration sequence of controls by the rule in equation (8):

$$
u^{k+1}(t) = P_K\left[ u^k(t) - Z(u^k(t)) \right], \quad k = 0, 1, \ldots \tag{8}
$$


We use the notation $Z(u^k(t)) = H_u(x^k(t), u^k(t), \lambda^k(t), t)$, where $x^k(t)$ and $\lambda^k(t)$ are the
solutions of the state and co-state equations, respectively, related to the control function $u^k(\cdot)$,
and $u^0$ is an initial control approximation. We consider the grid points $t_i = ih$, $i = 0, 1, \ldots, N$,
for $N = t_f / h$, the initial approximation $u_i^0 = u^0(t_i)$, $i = 0, 1, \ldots, N-1$, and the definition of the
$(k+1)$th approximation is given in equation (9):

$$
u_i^{k+1} = P_K\left[ u_i^k - Z(u_i^k) \right], \quad k = 0, 1, \ldots \tag{9}
$$


where $Z(u_i^k) = H_u(x_i^k, u_i^k, \lambda_i^k, t_i)$ and $x_i^k$, $\lambda_i^k$ are obtained after applying the Euler method
to the state and co-state equations using the control approximations $u_i^k$ on the intervals $[t_i, t_{i+1}]$,
$i = 0, 1, \ldots, N-1$, i.e.,

$$
x_{i+1}^k = x_i^k + h \left( A x_i^k + B u_i^k \right), \quad x_0^k = x_0, \tag{10}
$$
$$
\lambda_i^k = \lambda_{i+1}^k + h H_x\left( x_i^k, u_i^k, \lambda_{i+1}^k, t_i \right), \quad \lambda_N^k = S x_N^k. \tag{11}
$$

Note that, from the above equations, the state and co-state vectors are computed forward and
backward, respectively.
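The forward and backward sweeps can be sketched as follows. This is an illustration only, assuming $H_x = Q x + A^T \lambda$ from the Hamiltonian (2) and evaluating it at $x_i$ and $\lambda_{i+1}$ in the backward step; the function name `euler_sweeps` and the array layout are hypothetical choices.

```python
import numpy as np

def euler_sweeps(A, B, Q, S, x0, u, h):
    """Forward Euler for the state and backward sweep for the co-state.

    u has shape (N, m), one control value per interval [t_i, t_{i+1}].
    Returns x and lam, both of shape (N + 1, n).
    """
    N = u.shape[0]
    n = x0.shape[0]
    x = np.zeros((N + 1, n))
    lam = np.zeros((N + 1, n))

    # State: marched forward from the initial condition x(0) = x0.
    x[0] = x0
    for i in range(N):
        x[i + 1] = x[i] + h * (A @ x[i] + B @ u[i])

    # Co-state: marched backward from the terminal condition lam(t_f) = S x(t_f).
    lam[N] = S @ x[N]
    for i in range(N - 1, -1, -1):
        lam[i] = lam[i + 1] + h * (Q @ x[i] + A.T @ lam[i + 1])
    return x, lam
```

The state must be computed first, because both the terminal condition and every backward step of the co-state use the state values.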


Remark 3. Based on Remark 2, $u^k$ is obtained from equation (9), and then $x^k$ and $\lambda^k$ are
provided by equations (10) and (11). Finally, by applying the obtained $u^k$ and $x^k$, the quadratic
performance index can be calculated according to equation (1):

$$
J^k = \frac{1}{2} x^{kT}(t_f) S x^k(t_f) + \frac{1}{2} \int_0^{t_f} \left( x^{kT}(t) Q x^k(t) + u^{kT}(t) R u^k(t) \right) dt. \tag{12}
$$


For accuracy analysis, we consider the following criterion (equation (13)). The
control (9) has the desired accuracy when, for a given positive constant $\varepsilon$, the following
condition holds:

$$
\left| \frac{J^k - J^{k-1}}{J^k} \right| < \varepsilon. \tag{13}
$$


If the tolerance error bound $\varepsilon > 0$ is chosen small enough, then the $k$th-order control
law will be very close to the optimal control law $u^*(t)$, the value of the quadratic performance
index in equation (12) will be very close to its optimal value $J^*$, and the boundary state
conditions will be satisfied tightly.
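On the grid, the performance index and the stopping test can be evaluated as in the sketch below. It assumes a left-endpoint rectangle rule for the integral and a relative-change test on successive values of the index; both are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def performance_index(x, u, Q, R, S, h):
    """Rectangle-rule approximation of the quadratic performance index.

    x has shape (N + 1, n) and u has shape (N, m); the integrand is
    sampled at the left endpoint of each interval.
    """
    state_cost = np.einsum("ij,jk,ik->", x[:-1], Q, x[:-1])   # sum_i x_i^T Q x_i
    control_cost = np.einsum("ij,jk,ik->", u, R, u)           # sum_i u_i^T R u_i
    terminal_cost = x[-1] @ S @ x[-1]
    return 0.5 * terminal_cost + 0.5 * h * (state_cost + control_cost)

def has_converged(J_new, J_old, eps):
    """Relative-change test, written without a division so that J_new = 0 is safe."""
    return abs(J_new - J_old) < eps * abs(J_new)
```

A typical loop would evaluate `performance_index` after each update of the control and stop as soon as `has_converged` returns `True`.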

The convergence analysis of the projection method is given in the following theorem. The
proof was derived in Pulova (2009).

Theorem 4. Let the sequence $u^k = (u_0^k, u_1^k, \ldots, u_{N-1}^k)$, $u_i^k \in K$, $K \subset \mathbb{R}^m$, $k = 0, 1, \ldots$, be
obtained by applying the projection method. There exists an accumulation point $\bar{u}$ of this
sequence and a piecewise constant function defined by $\bar{u}(t) = \bar{u}_i$ for $t \in [t_i, t_{i+1})$. Also, for
$u^*(t) \in T^*$, where $T^* = \{ u(\cdot) \mid \langle Z(u(\cdot)), v(\cdot) - u(\cdot) \rangle \ge 0, \ \forall v(\cdot) \in K \}$, we have:

$$
\| u^* - \bar{u} \|^2 \le O(h), \tag{14}
$$

where $\| u - v \| = \max_{0 \le i \le N-1} | u_i - v_i |$.
4. Simulation results


This section will test the method on an example and a case study. 


4.1. An example 
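Consider the following OCP from Pulova (2009):

$$
\begin{aligned}
\min\ & \int_0^{t_f} \left[ x^2(t) + u^2(t) \right] dt \\
\text{s.t.}\ & \dot{x}(t) = -a x(t) + B u(t), \\
& x(0) = 1, \\
& |u| \le 1.
\end{aligned}
\tag{15}
$$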


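The analytical optimal solution to this problem is written in terms of the following constants:

$$
r_1 = \sqrt{a^2 + 1}, \quad r_2 = -\sqrt{a^2 + 1},
$$
$$
c_1 = \frac{1}{r_1 - a - (r_2 - a) e^{r_1 - r_2}}, \quad c_2 = \frac{1}{r_2 - a - (r_1 - a) e^{r_1 - r_2}}.
$$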
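The two roots can be traced back to the optimality system of this example. A short derivation, assuming $B = 1$ and considering only arcs on which the bound $|u| \le 1$ is inactive (both are assumptions made here for illustration, chosen to be consistent with the stated values of $r_1$ and $r_2$):

```latex
\[
H = x^2 + u^2 + \lambda(-a x + u), \qquad
\frac{\partial H}{\partial u} = 2u + \lambda = 0 \;\Rightarrow\; u = -\tfrac{1}{2}\lambda,
\qquad
\dot{\lambda} = -\frac{\partial H}{\partial x} = -2x + a\lambda .
\]
\[
\begin{pmatrix} \dot{x} \\ \dot{\lambda} \end{pmatrix}
=
\begin{pmatrix} -a & -\tfrac{1}{2} \\ -2 & a \end{pmatrix}
\begin{pmatrix} x \\ \lambda \end{pmatrix},
\qquad
\det\begin{pmatrix} -a - r & -\tfrac{1}{2} \\ -2 & a - r \end{pmatrix}
= r^2 - a^2 - 1 = 0
\;\Rightarrow\;
r_{1,2} = \pm\sqrt{a^2 + 1}.
\]
```

On such arcs the state and co-state are therefore combinations of $e^{r_1 t}$ and $e^{r_2 t}$, with coefficients fixed by the boundary conditions.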


We solve the problem by setting $a = 1$, $N = 100$, $h = 0.01$, and $t_i = ih$ for $i = 0, 1, \ldots, N$.
The transient behaviour of the control variable is shown in Figure 1. As can be seen, the initial
value is chosen outside the feasible region ($u_0 = -2$) and the solution still converges to the
optimal solution. This is an advantage of using the projection method.
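A minimal sketch of the projection iteration on this example, assuming $B = 1$ and $t_f = Nh = 1$. Because halving the cost does not change the minimizer, the sketch uses the form of problem (1) with $Q = R = 1$ and $S = 0$, so that $Z = u + \lambda$. The function name, tolerance and iteration cap are hypothetical choices, and the printed numbers are not taken from the paper.

```python
import numpy as np

def solve_example(a=1.0, b=1.0, N=100, h=0.01, u_init=-2.0, eps=1e-10, max_iter=200):
    """Projection iteration for min 1/2 int (x^2 + u^2) dt, x' = -a x + b u, x(0) = 1, |u| <= 1."""
    u = np.full(N, u_init)              # initial control, deliberately outside [-1, 1]
    J_old = None
    for k in range(max_iter):
        # Forward Euler for the state.
        x = np.empty(N + 1)
        x[0] = 1.0
        for i in range(N):
            x[i + 1] = x[i] + h * (-a * x[i] + b * u[i])
        # Backward sweep for the co-state; lam(t_f) = 0 because S = 0.
        lam = np.empty(N + 1)
        lam[N] = 0.0
        for i in range(N - 1, -1, -1):
            lam[i] = lam[i + 1] + h * (x[i] - a * lam[i + 1])
        # Rectangle-rule cost and relative-change stopping test.
        J = 0.5 * h * np.sum(x[:-1] ** 2 + u ** 2)
        if J_old is not None and abs(J - J_old) < eps * abs(J):
            break
        J_old = J
        Z = u + b * lam[:-1]            # H_u = R u + B^T lam with R = 1
        u = np.clip(u - Z, -1.0, 1.0)   # projection onto K = [-1, 1]
    # If max_iter is reached without a break, u is one update ahead of x and J.
    return u, x, J, k

u, x, J, iterations = solve_example()
print(f"J = {J:.4f} after {iterations} iterations, u(0) = {u[0]:.4f}")
```

The infeasible starting value causes no difficulty in this sketch, since the first projection already maps the control into $[-1, 1]$.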


Figure 1. Trajectories of control vector: (a) trajectory of $u(t_i)$ and $u^*$; (b) trajectory of $u(t_i)$ and $u^*$.


4.2. Case study: production planning in reverse logistics system 


Finished products and manufacturing plants are elements of the production system in a supply
chain (SC), which typically contains several manufacturing plants. These plants produce work in
process and finished products and hold them in warehouses, so they need to plan and control
production and inventories. Isolated planning and control by different manufacturers increases
their inventories, so planning and control must be carried out in an integrated manner.
Applications in management science involve the control of dynamic systems, which may be
continuous-time or discrete-time depending on whether time varies continuously or
discretely. These systems are an important research area in management (Sethi and Thompson,
2000; Kistner and Dobos, 2000; Tang et al., 2021; Vicil, 2021). An exciting topic in this
area is the application of optimal control theory to the production-inventory system.

Here, we solve an OCP with the proposed method. The OCP models production planning
in a three-stock reverse logistics system with deteriorating items (Hedjar et al., 2015).
We adopt the following definitions from Hedjar et al. (2015):


$I_r(t)$: Inventory of remanufacturing at time $t$.
$I_m(t)$: Inventory of manufacturing at time $t$.
$I_R(t)$: Inventory of returned items at time $t$.
$u_r(t)$: Level of remanufacturing at time $t$.
$u_m(t)$: Level of manufacturing at time $t$.
$u_d(t)$: Level of disposal at time $t$.


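From Hedjar et al. (2015), the control and the state vectors are $u(t) =
(\Delta u_m(t), \Delta u_r(t), \Delta u_d(t))^T$ and $x(t) = (\Delta I_m(t), \Delta I_r(t), \Delta I_R(t))^T$, respectively, where

$$
\begin{aligned}
\Delta I_m(t) &= I_m(t) - \hat{I}_m(t), \\
\Delta I_r(t) &= I_r(t) - \hat{I}_r(t), \\
\Delta I_R(t) &= I_R(t) - \hat{I}_R(t), \\
\Delta u_m(t) &= u_m(t) - \hat{u}_m(t), \\
\Delta u_r(t) &= u_r(t) - \hat{u}_r(t), \\
\Delta u_d(t) &= u_d(t) - \hat{u}_d(t).
\end{aligned}
$$

Also, the hat denotes the target value of a variable. The following OCP is given in Hedjar et
al. (2015):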
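$$
\begin{aligned}
\min\ & J = \frac{1}{2} \int_0^{t_f} \left[ q_m \Delta I_m^2(t) + q_r \Delta I_r^2(t) + q_R \Delta I_R^2(t) + r_m \Delta u_m^2(t) + r_r \Delta u_r^2(t) + r_d \Delta u_d^2(t) \right] dt \\
\text{s.t.}\ & \frac{d}{dt} \Delta I_m(t) = \Delta u_m(t) - \theta_m \Delta I_m(t), \\
& \frac{d}{dt} \Delta I_r(t) = \Delta u_r(t) - \theta_r \Delta I_r(t), \\
& \frac{d}{dt} \Delta I_R(t) = -\Delta u_r(t) - \Delta u_d(t), \\
& \Delta I_m(0) = I_m^0, \quad \Delta I_r(0) = I_r^0, \quad \Delta I_R(0) = I_R^0.
\end{aligned}
$$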
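The OCP can be restated in the following matrix form:

$$
\begin{aligned}
\min\ & J = \frac{1}{2} \int_0^{t_f} \left( x^T(t) Q x(t) + u^T(t) R u(t) \right) dt \\
\text{s.t.}\ & \dot{x}(t) = A x(t) + B u(t), \\
& x(0) = x_0, \\
& |u| \le 10,
\end{aligned}
$$

where

$$
Q = \begin{pmatrix} q_m & 0 & 0 \\ 0 & q_r & 0 \\ 0 & 0 & q_R \end{pmatrix}, \quad
R = \begin{pmatrix} r_m & 0 & 0 \\ 0 & r_r & 0 \\ 0 & 0 & r_d \end{pmatrix}, \quad
A = \begin{pmatrix} -\theta_m & 0 & 0 \\ 0 & -\theta_r & 0 \\ 0 & 0 & 0 \end{pmatrix},
$$
$$
B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & -1 \end{pmatrix}, \quad
x_0 = \begin{pmatrix} \Delta I_m(0) \\ \Delta I_r(0) \\ \Delta I_R(0) \end{pmatrix}.
$$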
Now, assume the given values in Table 1 from Hedjar et al. (2015).
Table 1. The given parameters and initial states
Parameter        Value   Parameter   Value   Parameter    Value   Parameter   Value   Parameter   Value
$\Delta I_m(0)$  15      $q_m$       1       $\theta_m$   0.01    $r_m$       0.1     $r_d$       0.3
$\Delta I_r(0)$  10      $q_r$       2       $\theta_r$   0.02    $r_r$       0.2     $t_f$       1.2
$\Delta I_R(0)$  5       $q_R$       3
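The data can be assembled and passed to the projection iteration as in the sketch below. It is an illustration under stated assumptions: the state order is manufacturing, remanufacturing, returned items; the values are as read from Table 1; the initial control is zero; and the update includes a damping factor `alpha` that is not part of equation (9) (with `alpha = 1` the update coincides with (9)). The function names, `alpha = 0.2`, the grid size and the tolerance are hypothetical choices, and the output is not a reproduction of the results of the paper.

```python
import numpy as np

# Data as read from Table 1 (state order: manufacturing, remanufacturing, returned items).
theta_m, theta_r = 0.01, 0.02
A = np.diag([-theta_m, -theta_r, 0.0])
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, -1.0, -1.0]])
Q = np.diag([1.0, 2.0, 3.0])
R = np.diag([0.1, 0.2, 0.3])
x0 = np.array([15.0, 10.0, 5.0])
tf, bound = 1.2, 10.0

def simulate(u, h):
    """Forward Euler state, backward co-state sweep (S = 0), and rectangle-rule cost."""
    N = len(u)
    x = np.empty((N + 1, 3))
    x[0] = x0
    for i in range(N):
        x[i + 1] = x[i] + h * (A @ x[i] + B @ u[i])
    lam = np.zeros((N + 1, 3))          # lam(t_f) = 0 because there is no terminal cost
    for i in range(N - 1, -1, -1):
        lam[i] = lam[i + 1] + h * (Q @ x[i] + A.T @ lam[i + 1])
    J = 0.5 * h * (np.sum((x[:-1] @ Q) * x[:-1]) + np.sum((u @ R) * u))
    return x, lam, J

def projection_method(N=120, alpha=0.2, eps=1e-6, max_iter=500):
    """Damped projection iteration; alpha is an assumption, alpha = 1 gives update (9)."""
    h = tf / N
    u = np.zeros((N, 3))
    x, lam, J = simulate(u, h)
    for _ in range(max_iter):
        Z = u @ R + lam[:-1] @ B        # rows are R u_i + B^T lam_i (R is symmetric)
        u_new = np.clip(u - alpha * Z, -bound, bound)
        x, lam, J_new = simulate(u_new, h)
        done = abs(J_new - J) < eps * abs(J_new)
        u, J = u_new, J_new
        if done:
            break
    return u, x, J

u, x, J = projection_method()
print("final cost:", J)
print("final state deviations:", x[-1])
```

The damping factor is exposed as a parameter so that the undamped update can be tried directly by calling `projection_method(alpha=1.0)`.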


Employing the proposed method gives Figure 2, which depicts the optimal control and state
variable trajectories.

Figure 2. Trajectories of state and control vectors: panels (a), (c) and (e) show the inventory deviations $\Delta I(t)$; panels (b), (d) and (f) show the control deviations $\Delta u(t)$.
The solutions tend to zero, similar to the results obtained in Hedjar et al. (2015), who used
the predictive control approach to solve the presented OCP.


5. Conclusion 


This article presented an iterative approach to solving the linear quadratic optimal control
problem with bounded control variables. The challenge of this problem is the bounded control
variables, because of which conventional techniques cannot be applied. The
iterative approach presented in this paper guarantees the uniform convergence of the solution
of the problem. We applied the projection function to construct the approximation method.
Employing the projection function has a further advantage: the initial value can be selected from
outside the feasible region. Finally, a case study on production planning in a reverse logistics
system with deteriorating items was given and solved with the proposed method.


Disclosure statement 
No potential conflict of interest was reported by the author(s). 


Eshaghnezhad et al., JSTINP 2023; Vol. 2. No. 1 DOI: 10.22067/JSINP.2023.81194.1037


JOURNAL OF SYSTEMS THINKING IN PRACTICE RESEARCH ARTICLE 


References 
Effati, S., Mansoori, A. and Eshaghnezhad, M., 2021. Linear quadratic optimal control problem with
fuzzy variables via neural network. Journal of Experimental & Theoretical Artificial Intelligence,
33(2), pp.283-296. https://doi.org/10.1080/0952813X.2020.1737245.


Eshaghnezhad, M., Effati, S. and Mansoori, A., 2022. A compact MLCP-based projection 
recurrent neural network model to solve shortest path problem. Journal of Experimental & 
Theoretical Artificial Intelligence, pp.1-19. https://doi.org/10.1080/0952813X.2022.2067247. 


Göllmann, L., Kern, D. and Maurer, H., 2009. Optimal control problems with delays in state and control
variables subject to mixed control-state constraints. Optimal Control Applications and Methods, 30(4), 
pp.341-365. https://doi.org/10.1002/oca.843. 


Hedjar, R., Garg, A.K. and Tadj, L., 2015. Model predictive production planning in a three-stock 
reverse-logistics system with deteriorating items. International Journal of Systems Science:
Operations & Logistics, 2(4), pp.187-198. https://doi.org/10.1080/23302674.2015.1015661. 


Kharatishvili, G.L., 1961. The maximum principle in the theory of optimal processes involving delay. 
In Doklady Akademii Nauk (Vol. 136, No. 1, pp. 39-42). Russian Academy of Sciences. 


Kistner, K.P. and Dobos, I., 2000. Optimal production-inventory strategies for a reverse logistics system. 
Optimization, Dynamics, and Economic Analysis: Essays in Honor of Gustav Feichtinger, pp.246-258. 
https://doi.org/10.1007/978-3-642-57684-3_21.


Kovacevic, R.M., Stilianakis, N.I. and Veliov, V.M., 2022. A distributed optimal control model applied 
to COVID-19 pandemic. SIAM Journal on Control and Optimization, 60(2), pp.S221-S245.
https://doi.org/10.1137/20M1373840.


Liu, Y., Jian, S. and Gao, J., 2022. Dynamics analysis and optimal control of SIVR epidemic model with 
incomplete immunity. Advances in Continuous and Discrete Models, 2022(1), pp.1-22. 
https://doi.org/10.1186/s13662-022-03723-7. 


Mansoori, A. and Effati, S., 2019. Parametric NCP-based recurrent neural network model: A 
new strategy to solve fuzzy nonconvex optimization problems. IEEE Transactions on Systems,
Man, and Cybernetics: Systems, 51(4), pp.2592-2601. 
https://doi.org/10.1109/TSMC.2019.2916750. 


Naidu, D.S., 2003. Optimal control systems, by CRC Press LLC. 


Ojo, M.M., Benson, T.O., Peter, O.J. and Goufo, E.F.D., 2022. Nonlinear optimal control strategies for 
a mathematical model of COVID-19 and influenza co-infection. Physica A: Statistical Mechanics and 
its Applications, 607, p.128173. https://doi.org/10.1016/j.physa.2022.128173. 


Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V. and Mishchenko, E.F., 1962. The maximum 
principle. The Mathematical Theory of Optimal Processes. New York: John Wiley and Sons. 


Pooya, A. and Pakdaman, M., 2017. Analysing the solution of production-inventory optimal control 
systems by neural networks. RAIRO-Operations Research, 51(3), pp.577-590.
https://doi.org/10.1051/ro/2016044. 


Pooya, A. and Pakdaman, M., 2018. A delayed optimal control model for multi-stage production- 
inventory system with production lead times. The International Journal of Advanced Manufacturing 
Technology, 94, pp.751-761. https://doi.org/10.1007/s00170-017-0942-5. 




Optimal Control Problem on Production Planning in the Reverse Logistics System JSTINP 


Pooya, A. and Pakdaman, M., 2019. Optimal control model for finite capacity continuous MRP with 
deteriorating items. Journal of Intelligent Manufacturing, 30, pp.2203-2215.
https://doi.org/10.1007/s10845-017-1383-6. 


Pooya, A., Mansoori, A., Eshaghnezhad, M. and Ebrahimpour, S.M., 2021. Neural network for a novel 
disturbance optimal control model for inventory and production planning in a four-echelon supply 
chain with reverse logistic. Neural Processing Letters, 53(6), pp.4549-4570.
https://doi.org/10.1007/s11063-021-10612-9. 


Pulova, N.V., 2009. A pointwise projected gradient method applied to an optimal control problem. 
Journal of computational and applied mathematics, 226(2), pp.331-335. 
https://doi.org/10.1016/j.cam.2008.08.007. 


Sethi, S.P. and Thompson, G.L., 2000. Optimal Control Theory: Applications to Management Science and
Economics. 2nd ed. Springer, USA.


Sweilam, N.H., Al-Mekhlafi, S.M. and Shatta, S.A., 2021. Optimal bang-bang control for variable-order 
dengue virus; numerical studies. Journal of Advanced Research, 32, pp.37-44.
https://doi.org/10.1016/j.jare.2021.03.010. 


Sweilam, N.H., Al-Mekhlafi, S.M., Albalawi, A.O. and Machado, J.T., 2021. Optimal control of 
variable-order fractional model for delay cancer treatments. Applied Mathematical Modelling, 89, 
pp.1557-1574. https://doi.org/10.1016/j.apm.2020.08.012. 


Tang, L., Yang, T., Tu, Y. and Ma, Y., 2021. Supply chain information sharing under consideration of 
bullwhip effect and system robustness. Flexible Services and Manufacturing Journal, 33, pp.337-380. 
https://doi.org/10.1007/s10696-020-09384-6. 


Vicil, O., 2021. Optimizing stock levels for service-differentiated demand classes with inventory 
rationing and demand lead times. Flexible Services and Manufacturing Journal, 33(2), pp.381-424. 
https://doi.org/10.1007/s10696-020-09378-4. 




